NSF PAR Search | NSF Public Access Repository

Foundation Models for Archaeological Feature Detection: Advances and Prospects

Guo, Junlin; Huo, Yuankai; Zimmer-Dauphinee, James; Nieusma, Jordan; Lu, Siqi; Wernke, Steven A; VanValkenburgh, Parker (May 2025, Computer Applications in Archaeology Conference 2025)

To date, deep learning models for archaeological feature detection have generally been built on off-the-shelf convolutional neural networks (CNNs) and vision transformer (ViT) models, which are pretrained on a variety of image types, sources, and subjects that are not specific to analyzing high-resolution satellite imagery. Recent advances in transformer-based vision models and self-supervised training approaches make it possible for researchers to generate foundation models that are more finely attuned to specific domains, without huge amounts of human-annotated training data. We discuss the development of two such models employing Meta's transformer-based DINOv2 framework. The first, DeepAndes, is based on the ingestion of a three-million-chip sample drawn from a two million square km area of high-resolution multispectral satellite imagery of the Andean region. This foundation model has broad utility across the social and earth sciences. The second, DeepAndesArch, is fine-tuned on labeled archaeological training data collected by the GeoPACHA project to create an archaeology-focused version of DeepAndes. We present the processes involved in generating DeepAndes and DeepAndesArch and discuss prospects for foundation models in archaeological research.
Free, publicly-accessible full text available May 7, 2026
Vision Foundation Models in Remote Sensing: A survey

https://doi.org/10.1109/MGRS.2025.3541952

Lu, Siqi; Guo, Junlin; Zimmer-Dauphinee, James R; Nieusma, Jordan M; Wang, Xiao; VanValkenburgh, Parker; Wernke, Steven A; Huo, Yuankai (January 2025, IEEE Geoscience and Remote Sensing Magazine)

Full Text Available
Semi-supervised contrastive learning for remote sensing: identifying ancient urbanization in the south-central Andes

https://doi.org/10.1080/01431161.2023.2192879

Xu, Jiachen; Guo, Junlin; Zimmer-Dauphinee, James; Liu, Quan; Shi, Yuxuan; Asad, Zuhayr; Wilkes, D. Mitchell; VanValkenburgh, Parker; Wernke, Steven A.; Huo, Yuankai (March 2023, International Journal of Remote Sensing)

Archaeology has long faced fundamental issues of sampling and scalar representation. Traditionally, local-to-regional-scale views of settlement patterns are produced through systematic pedestrian surveys. Recently, systematic manual survey of satellite and aerial imagery has enabled continuous distributional views of archaeological phenomena at interregional scales. However, such ‘brute force’ manual imagery survey methods are time- and labour-intensive and prone to inter-observer differences in sensitivity and specificity. The development of self-supervised learning methods (e.g. contrastive learning) offers a scalable learning scheme for locating archaeological features using unlabelled satellite and historical aerial images. However, archaeological features generally occupy only a very small proportion of the landscape, while the modern contrastive-supervised learning approach typically yields inferior performance on highly imbalanced datasets. In this work, we propose a framework to address this long-tail problem. Unlike existing contrastive learning approaches, which typically treat the labelled and unlabelled data separately, our proposed method reforms the learning paradigm under a semi-supervised setting to fully utilize the precious annotated data (<7% in our setting). Specifically, the highly unbalanced nature of the data is employed as prior knowledge to form pseudo negative pairs by ranking the similarities between unannotated image patches and annotated anchor images. In this study, we used 95,358 unlabelled images and 5,830 labelled images to detect ancient buildings in a long-tailed satellite image dataset. Our semi-supervised contrastive learning model achieved a promising testing balanced accuracy of 79.0%, a 3.8% improvement over other state-of-the-art approaches.
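The pseudo negative pairing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which end of the similarity ranking is used, so the sketch assumes that the unlabelled patches least similar to a labelled anchor are taken as pseudo negatives. The function names, the toy two-dimensional embeddings, and the choice of cosine similarity are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_pseudo_negatives(anchor, unlabelled, k):
    """Return indices of the k unlabelled embeddings least similar to the anchor.

    Assumption (not stated in the abstract): because archaeological features
    are rare, the patches that rank lowest in similarity to a labelled anchor
    are treated as safe pseudo negatives.
    """
    scored = [
        (cosine_similarity(anchor, emb), idx)
        for idx, emb in enumerate(unlabelled)
    ]
    scored.sort()  # ascending similarity; ties are broken by index
    return [idx for _, idx in scored[:k]]


# Toy embeddings standing in for encoder outputs (hypothetical values).
anchor = [1.0, 0.0]
unlabelled = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0], [0.7, 0.7]]
pseudo_negative_ids = select_pseudo_negatives(anchor, unlabelled, k=2)
print(pseudo_negative_ids)
```

In a real pipeline the embeddings would come from the image encoder being trained, and the number of pseudo negatives per anchor would be a tuned hyperparameter.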
Full Text Available
